yaImpute: An R Package for kNN Imputation

Authors

  • Nicholas L. Crookston
  • Andrew O. Finley
Abstract

This article introduces yaImpute, an R package for nearest neighbor search and imputation. Although nearest neighbor imputation is used in a host of disciplines, the methods implemented in the yaImpute package are tailored to imputation-based forest attribute estimation and mapping. The impetus for writing yaImpute is a growing interest in nearest neighbor imputation methods for spatially explicit forest inventory, and a need within this research community for software that facilitates comparison among different nearest neighbor search algorithms and subsequent imputation techniques. yaImpute provides directives for defining the search space, subsequent distance calculation, and imputation rules for a given number of nearest neighbors. Further, the package offers a suite of diagnostics for comparison among results generated from different imputation analyses and a set of functions for mapping imputation results.
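
The general idea described in the abstract (choose a search space, compute distances, then apply an imputation rule over k nearest neighbors) can be illustrated with a minimal sketch. This is not the yaImpute API: the function name `knn_impute`, the use of plain Euclidean distance, and the unweighted mean as the imputation rule are all assumptions made here for illustration only.

```python
import math


def knn_impute(ref_x, ref_y, target_x, k=1):
    """Impute a response for each target from its k nearest references.

    ref_x    : list of predictor tuples for reference observations
               (observations where the response is known)
    ref_y    : list of known response values, aligned with ref_x
    target_x : list of predictor tuples for observations to impute
    k        : number of nearest neighbors to use

    Assumptions (illustrative, not taken from the package): distances are
    Euclidean in the raw predictor space, and the imputation rule is the
    unweighted mean of the k neighbors' responses.
    """
    if len(ref_x) != len(ref_y):
        raise ValueError("ref_x and ref_y must have the same length")
    if not 1 <= k <= len(ref_x):
        raise ValueError("k must be between 1 and the number of references")

    imputed = []
    for t in target_x:
        # Distance from this target to every reference observation;
        # ties are broken by reference index because tuples sort in order.
        dists = sorted((math.dist(t, r), i) for i, r in enumerate(ref_x))
        neighbors = [i for _, i in dists[:k]]
        # Imputation rule: mean response of the k nearest references.
        imputed.append(sum(ref_y[i] for i in neighbors) / k)
    return imputed


# Hypothetical toy data: four reference plots with two predictors each.
ref_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
ref_y = [10.0, 20.0, 30.0, 100.0]
targets = [(0.1, 0.1), (4.0, 4.0)]

one_nn = knn_impute(ref_x, ref_y, targets, k=1)
three_nn = knn_impute(ref_x, ref_y, targets, k=3)
```

With k = 1 each target simply receives the response of its single closest reference; larger k trades that fidelity for smoothing. A package such as the one described would typically let the distance metric and the imputation rule be swapped out, which is where the comparison among methods becomes relevant.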

Related resources

Imputation of parent-offspring trios and their effect on accuracy of genomic prediction using Bayesian method

The objective of this study was to evaluate the imputation accuracy of parent-offspring trios under different scenarios. By using simulated datasets, the performance of Bayesian LASSO in genomic prediction was also examined. The genome consisted of 5 chromosomes and each chromosome was set as 1 Morgan length. The number of SNPs per chromosome was 10000. One hundred QTLs were randomly distributed a...

KNN Classification and Regression using SAS

K-Nearest Neighbor (KNN) classification and regression are two widely used analytic methods in predictive modeling and data mining fields. They provide a way to model highly nonlinear decision boundaries, and to fulfill many other analytical tasks such as missing value imputation, local smoothing, etc. In this paper, we discuss ways in SAS® to conduct KNN classification and KNN Regression. S...

Inferring missing genotypes in large SNP panels using fast nearest-neighbor searches over sliding windows

MOTIVATION: Typical high-throughput genotyping techniques produce numerous missing calls that confound subsequent analyses, such as disease association studies. Common remedies for this problem include removing affected markers and/or samples or, otherwise, imputing the missing data. On small marker sets imputation is frequently based on a vote of the K-nearest-neighbor (KNN) haplotypes, but thi...

rknn: an R Package for Parallel Random KNN Classification with Variable Selection

Random KNN (RKNN) is a novel generalization of traditional nearest-neighbor modeling. Random KNN consists of an ensemble of base k-nearest neighbor models, each constructed from a random subset of the input variables. A collection of r such base classifiers is combined to build the final Random KNN classifier. Since the base classifiers can be computed independently of one another, the overall ...

Combining kNN Imputation and Bootstrap Calibrated: Empirical Likelihood for Incomplete Data Analysis

The k-nearest neighbor (kNN) imputation, as one of the most important research topics in incomplete data discovery, has been developed with great success on industrial data. However, it is difficult to obtain a mathematically valid and simple procedure to construct confidence intervals for evaluating the imputed data. This chapter studies a new estimation for missing (or incomplete) data that i...

Publication date: 2008